Fast Data Anonymization with Low Information Loss

Authors

Gabriel Ghinita

Panagiotis Karras

Panos Kalnis

Nikos Mamoulis

Abstract

Recent research studied the problem of publishing microdata without revealing sensitive information, leading to the privacy-preserving paradigms of k-anonymity and ℓ-diversity. k-anonymity protects against the identification of an individual's record. ℓ-diversity, in addition, safeguards against the association of an individual with specific sensitive information. However, existing approaches suffer from at least one of the following drawbacks: (i) The information loss metrics are counter-intuitive and fail to capture data inaccuracies inflicted for the sake of privacy. (ii) ℓ-diversity is solved by techniques developed for the simpler k-anonymity problem, which introduces unnecessary inaccuracies. (iii) The anonymization process is inefficient in terms of computation and I/O cost. In this paper we propose a framework for efficient privacy preservation that addresses these deficiencies. First, we focus on one-dimensional (i.e., single attribute) quasi-identifiers, and study the properties of optimal solutions for k-anonymity and ℓ-diversity, based on meaningful information loss metrics. Guided by these properties, we develop efficient heuristics to solve the one-dimensional problems in linear time. Finally, we generalize our solutions to multi-dimensional quasi-identifiers using space-mapping techniques. Extensive experimental evaluation shows that our techniques clearly outperform the state-of-the-art, in terms of execution time and information loss.
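To make the two privacy notions in the abstract concrete, the following is a minimal Python sketch, not the heuristics proposed in the paper. It assumes records are `(quasi_identifier, sensitive_value)` pairs with a single numeric quasi-identifier, groups them naively into consecutive runs after sorting, and interprets ℓ-diversity in its simplest form (at least ℓ distinct sensitive values per group). The function names and the sample data are hypothetical.

```python
def group_sorted(records, k):
    """Naive grouping (an assumption, not the paper's method): sort by the
    one-dimensional quasi-identifier and cut into consecutive runs of k."""
    ordered = sorted(records, key=lambda r: r[0])
    groups = [ordered[i:i + k] for i in range(0, len(ordered), k)]
    # Fold a short trailing group into its predecessor so no group is below k.
    if len(groups) > 1 and len(groups[-1]) < k:
        last = groups.pop()
        groups[-1].extend(last)
    return groups


def is_k_anonymous(groups, k):
    """Every group must contain at least k records."""
    return all(len(g) >= k for g in groups)


def is_l_diverse(groups, l):
    """Distinct l-diversity: every group holds at least l different
    sensitive values."""
    return all(len({sensitive for _, sensitive in g}) >= l for g in groups)


# Hypothetical microdata: (age, diagnosis)
records = [
    (25, "flu"), (27, "flu"), (31, "cancer"), (34, "flu"),
    (40, "asthma"), (45, "flu"), (52, "cancer"),
]
groups = group_sorted(records, k=3)
print(is_k_anonymous(groups, 3), is_l_diverse(groups, 2), is_l_diverse(groups, 3))
```

On this sample the grouping satisfies 3-anonymity and 2-diversity but not 3-diversity, which illustrates why a grouping built only for k-anonymity does not automatically give the stronger guarantee.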

Similar resources

Utility-preserving anonymization for health data publishing

BACKGROUND: Publishing raw electronic health records (EHRs) may be considered a breach of the privacy of individuals because they usually contain sensitive information. A common practice in privacy-preserving data publishing is to anonymize the data before publishing, and thus satisfy privacy models such as k-anonymity. Among various anonymization techniques, generalization is the most c...

Trading Privacy for Information Loss in the Blink of an Eye

The publishing of data with privacy guarantees is a task typically performed by a data curator who is expected to provide guarantees for the data he publishes in quantitative fashion, via a privacy criterion (e.g., k-anonymity, l-diversity). The anonymization of data is typically performed off-line. In this paper, we provide algorithmic tools that facilitate the negotiation for the anonymizatio...

k-anonymity based framework for privacy preserving data collection in wireless sensor networks

In this paper, the notion of k-anonymity is adopted in wireless sensor networks (WSN) as a security framework with two levels of privacy. A base level of privacy is provided for the data shared with a semi-trusted sink, and a deeper level of privacy is provided against eavesdroppers. In the proposed method, some portions of the data are encrypted and the rest is generalized. Generalization shortens...

Privacy Preserving Data Publishing Based on k-Anonymity by Categorization of Sensitive Values

In many organizations, large amounts of personal data are collected and analyzed by data miners for research purposes. However, the data collected may contain sensitive information which should be kept confidential. The study of privacy-preserving data publishing (PPDP) focuses on removing privacy threats while, at the same time, preserving useful information in the released data for data m...

Novel Approaches for Privacy Preserving Data Mining in k-Anonymity Model

In privacy-preserving data mining, anonymization-based approaches have been used to preserve the privacy of an individual. Existing literature addresses various anonymization-based approaches for preserving the sensitive private information of an individual. The k-anonymity model is one of the most widely used anonymization-based approaches. However, the anonymization-based approaches suffer from the ...

Publication date: 2007
